Recruiting Large Online Samples in the United States and India: Facebook, Mechanical Turk and Qualtrics 

Replication Archive

Taylor C. Boas
Dino P. Christenson
David M. Glick

March 21, 2018

This replication archive contains files necessary to replicate the results in Taylor C. Boas, Dino P. Christenson, and David M. Glick, “Recruiting Large Online Samples in the United States and India: Facebook, Mechanical Turk and Qualtrics,” Political Science Research and Methods, forthcoming. All analysis was conducted in R 3.4.3 on MacOS 10.13.2. 

Please note that the following packages must be installed in order to run the replication code: car, openxlsx, foreign, ggmap, Hmisc, survey, lsr, XML, sp, raster, readstata13.
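The package requirement above can be checked before running anything. The following is a minimal sketch, not part of the archive: it only reports which of the listed packages are missing and leaves installation commented out. Current CRAN versions may differ from those used with R 3.4.3, so results from newer versions are not guaranteed to match exactly.

```r
# Packages listed in this readme as required by the replication code.
required <- c("car", "openxlsx", "foreign", "ggmap", "Hmisc", "survey",
              "lsr", "XML", "sp", "raster", "readstata13")

# Names of all packages installed in the current library paths.
installed <- rownames(installed.packages())

# Required packages that are not yet installed.
missing_pkgs <- setdiff(required, installed)

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install from CRAN (requires network access):
  # install.packages(missing_pkgs)
} else {
  message("All required packages are installed.")
}
```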

The replication starts with the raw survey data files. These are the files downloaded from Qualtrics, edited only to remove variables that would compromise privacy (IP Address and browser-related information detected by Qualtrics), are empty/non-varying, or have no analytical value (e.g., the randomly-generated MTurk payment claim code or the Qualtrics-generated panel recruitment tracking variables). We also removed respondents who did not consent to the survey or were ineligible based on age or residence outside India or the U.S. Finally, we removed one survey Preview response (from one of us) that was mixed in with the data.

Starting with the raw survey data files, the R code cleans the survey data, merges in external data, and conducts the analysis. The archive reproduces all tables and figures from the main text and Appendix except for Appendix Figure 1, which contains the Facebook advertisements used to recruit participants, and Appendix Tables 1 and 2, which summarize the research process and conclusions and do not involve data analysis. Figures are generated in the main directory as PDFs, and tables as .tex files. The correspondence between files and Figure/Table numbers is summarized below:

Main text Figure 1: us_demog.pdf
Main text Figure 2: india_demog.pdf
Main text Figure 3: i_south_fb_heatmap.pdf, i_south_mt_heatmap.pdf, i_south_qt_heatmap.pdf
Main text Figure 4: us_pol1.pdf
Main text Figure 5: us_pol2.pdf
Main text Figure 6: india_pol1.pdf
Main text Figure 7: india_pol2.pdf
Main text Figure 8: coop.pdf
Main text Figure 9: experiments.pdf
Appendix Table 3: us_demog_table.tex
Appendix Table 4: india_demog_table.tex
Appendix Table 5: us_pol_table.tex
Appendix Table 6: india_pol_table.tex
Appendix Table 7: coop_table.tex
Appendix Table 8: us_stddiff_demog_table.tex
Appendix Table 9: india_stddiff_demog_table.tex
Appendix Table 10: us_stddiff_pol_table.tex
Appendix Table 11: india_stddiff_pol_table.tex
Appendix Table 12: states_phi_table.tex
Appendix Table 13: states_cramer_table.tex
Appendix Figure 2: us_states_oursurvey.pdf
Appendix Figure 3: india_states_oursurvey.pdf
Appendix Figure 4: states_phi.pdf
Appendix Figure 5: india_fb_heatmap.pdf
Appendix Figure 6: india_mt_heatmap.pdf
Appendix Figure 7: india_qt_heatmap.pdf

To generate a single PDF document containing all of the properly-formatted (pre-publication) tables and figures from both the main text and Appendix, please compile the file fb_mturk_replication.tex.

While the archive includes R code to replicate the entire analysis, including cleaning the data and merging external data, those who wish only to replicate the tables and figures in the main text and Appendix can start with the file 5_analyze_demographics.R and proceed sequentially from there. The files written by steps 1-4 are included in the archive to facilitate this process. When also replicating the cleaning and merging process, these files will be overwritten with identical files produced by the code in steps 1-4.

The archive contains the following files:


TEXT FILES:

readme.txt: This file

replication_log.txt: The R console output from a session in which all replication files are run sequentially


LATEX FILES:

fb_mturk_replication.tex: Compiling this will generate a PDF document with all of the tables and figures from both the main text and Appendix


R CODE (run these in order from step 1, or skip to step 5 to skip the cleaning/merging process, as noted above):

1_clean_us_survey.R: Cleans US survey data; writes the file us.RData

2_clean_india_survey.R: Cleans India survey data; writes the file india.RData

3_merge_external_data_us.R: Merges external data into the US file; writes the file us_completions_augmented.RData

4_merge_external_data_india.R: Merges external data into the India file; writes the file india_completions_augmented.RData

5_analyze_demographics.R: Replicates Main Text Figures 1-2, Appendix Figures 2-4, and Appendix Tables 3, 4, 8, 9, 12, and 13, as well as results conveyed textually but not in tables or figures

6_analyze_spaces.R: Replicates Main Text Figure 3 and Appendix Figures 5-7

7_analyze_politics.R: Replicates Main Text Figures 4-7 and Appendix Tables 5, 6, 10, and 11

8_analyze_cooperativeness.R: Replicates Main Text Figure 8 and Appendix Table 7

9_analyze_experiments.R: Replicates Main Text Figure 9
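The run order described above can be sketched as a small driver. This is an illustration only: run_replication is a hypothetical helper that is not included in the archive, and it assumes the scripts expect the archive's main directory as the working directory. Each script is sourced into the global environment of a single session; if a script clears the workspace or conflicts with another, running each step in a fresh R session may be safer.

```r
# Replication scripts in the order listed in this readme.
scripts <- c("1_clean_us_survey.R",
             "2_clean_india_survey.R",
             "3_merge_external_data_us.R",
             "4_merge_external_data_india.R",
             "5_analyze_demographics.R",
             "6_analyze_spaces.R",
             "7_analyze_politics.R",
             "8_analyze_cooperativeness.R",
             "9_analyze_experiments.R")

# Hypothetical helper (not part of the archive): run steps `from` through 9.
# `dir` is assumed to be the archive's main directory.
run_replication <- function(from = 1, dir = ".") {
  stopifnot(from %in% seq_along(scripts))
  todo <- scripts[from:length(scripts)]  # fix the list before sourcing anything
  old_wd <- setwd(dir)
  on.exit(setwd(old_wd))                 # restore the working directory on exit
  for (f in todo) {
    message("Running ", f)
    source(f)
  }
  invisible(todo)
}

# Full replication, including cleaning and merging:
# run_replication(from = 1)
# Tables and figures only, using the included intermediate files:
# run_replication(from = 5)
```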


CODEBOOKS

us_codebook.pdf, india_codebook.pdf: Codebooks for the cleaned survey data produced by the replication code


RAW DATA

india_raw.RData, us_raw.RData: Raw survey data files as described above


CLEANED AND MERGED DATA

us.RData, india.RData: Cleaned survey data files

us_completions_augmented.RData, india_completions_augmented.RData: The cleaned survey data files, subsetted on complete survey responses, with external data merged in.
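To see what one of these .RData files contains without cluttering the workspace, it can be loaded into a separate environment. The sketch below uses a hypothetical helper, inspect_rdata, which is not part of the archive; it makes no assumption about the names of the objects stored in each file and simply reports whatever load() finds.

```r
# Hypothetical helper (not part of the archive): load an .RData file into
# its own environment and report the objects it contains.
inspect_rdata <- function(path) {
  env <- new.env()
  obj_names <- load(path, envir = env)  # load() returns the object names
  for (nm in obj_names) {
    obj <- get(nm, envir = env)
    dims <- if (is.data.frame(obj)) {
      paste0(" (", nrow(obj), " rows, ", ncol(obj), " columns)")
    } else {
      ""
    }
    message(nm, ": ", class(obj)[1], dims)
  }
  invisible(env)
}

# Example, run from the archive's main directory:
# env <- inspect_rdata("us_completions_augmented.RData")
# ls(env)
```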


RAW QUESTIONNAIRES

India_Facebook_Survey_2015.docx, India_MTurk_Survey_2015.docx, India_Qualtrics_Survey_2015.docx, US_Facebook_Survey_2015.docx, US_MTurk_Survey_2015.docx, US_Qualtrics_Survey_2015.docx: Qualtrics-generated questionnaires for the raw survey data


COMPARISON SURVEYS

gss.RData: Variables used in the analysis from the 2014 General Social Survey cross-section

cces.RData: Variables used in the analysis from the 2014 Cooperative Congressional Election Study

cces_augmented.RData: The cces.RData file with external data merged in.

hasel.RData: Variables used in the analysis from the replication data for Haselswerdt, Jake and Brandon L. Bartels. 2015. “Public Opinion, Policy Tools, and the Status Quo: Evidence from a Survey Experiment.” Political Research Quarterly 68(3):607–621

kriner.RData: Variables used in the analysis from the replication data for Kriner, Douglas L. and Francis X. Shen. 2016. “Conscription, Inequality, and Partisan Support for War.” Journal of Conflict Resolution 60(8):1419–1445

wvs.RData: Variables used in the analysis from the World Values Survey in India (Wave 6, 2014)

anes.RData: Variables used in the analysis from the 2012 American National Election Study (non-oversampled face-to-face interviews only)

SONS 2006.pdf: Codebook for the 2006 State of the Nation Survey in India

India_NES_2009-post-poll-survey-finding.pdf: Codebook for the 2009 National Election Study in India

all-india-findings.pdf: Codebook for the 2014 National Election Study in India (pre-poll wave)

All-India-Postpoll-2014-Survey-Findings.pdf: Codebook for the 2014 National Election Study in India (post-poll wave)


INDIA DATA

all_india_PO_list_without_APS_offices_ver2_lat_long.csv: Database of post offices with PIN codes

census_pop_districts.RData: Population by district in the 2011 census

india_polygons.RData: Polygons for drawing maps of India

india_pop_age_state_sex.csv: Population by age, state, and sex in the 2011 census

PINs_geocoded.RData: PIN codes in the survey data, geocoded using Google Maps


US DATA

2014_Gaz_counties_national.txt: Database containing the land area and other geographic statistics of U.S. counties

AgeSexRegion.xlsx: Population by age, sex, and region in the 2010 census

Resident Population Data (Text Version) - 2010 Census.html: Population by state in the 2010 census and prior censuses

ruralurbancodes2013.xlsx: Database of Rural-Urban Continuum Codes by county

State Abbreviations.html: State names and two-letter abbreviations

ZipToCounty.xlsx: Crosswalk of ZIP codes and U.S. counties
